[SPARK-2995][MLLIB] add ALS.setIntermediateRDDStorageLevel #1913
mengxr wants to merge 2 commits into apache:master
QA tests have started for PR 1913. This patch merges cleanly.

QA results for PR 1913:

@mengxr: I would prefer `setIntermediateRDDStorageLevel`.

QA tests have started for PR 1913. This patch merges cleanly.

QA results for PR 1913:

As mentioned in SPARK-2465, using `MEMORY_AND_DISK_SER` for user/product in/out links together with `spark.rdd.compress=true` can help reduce the space requirement by a lot, at the cost of speed. It might be useful to add this option so people can run ALS on much bigger datasets.

Another option for the method name is `setIntermediateRDDStorageLevel`.

Author: Xiangrui Meng <meng@databricks.com>

Closes #1913 from mengxr/als-storagelevel and squashes the following commits:

d942017 [Xiangrui Meng] rename to setIntermediateRDDStorageLevel
7550029 [Xiangrui Meng] add ALS.setIntermediateDataStorageLevel

(cherry picked from commit 69a57a1)
Signed-off-by: Xiangrui Meng <meng@databricks.com>

Merged into both master and branch-1.1.

As mentioned in SPARK-2465, using `MEMORY_AND_DISK_SER` for user/product in/out links together with `spark.rdd.compress=true` can help reduce the space requirement by a lot, at the cost of speed. It might be useful to add this option so people can run ALS on much bigger datasets.

Another option for the method name is `setIntermediateRDDStorageLevel`.
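
For illustration, here is a minimal sketch of how the option described above might be used from the RDD-based MLlib API. It is not taken from the patch itself: the object name, the local master setting, the toy ratings, and the rank, iteration, and lambda values are all assumptions chosen only to keep the example small.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, Rating}
import org.apache.spark.storage.StorageLevel

// Hypothetical driver object; the name and all parameter values are illustrative.
object AlsStorageLevelSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("AlsStorageLevelSketch")
      .setMaster("local[2]")               // assumption: local run for demonstration
      .set("spark.rdd.compress", "true")   // compress serialized RDD partitions
    val sc = new SparkContext(conf)
    try {
      // Toy ratings (user, product, rating); real inputs would be far larger.
      val ratings = sc.parallelize(Seq(
        Rating(1, 10, 5.0),
        Rating(1, 20, 1.0),
        Rating(2, 10, 4.0),
        Rating(2, 30, 2.0),
        Rating(3, 20, 3.0)
      ))

      // Store the intermediate in/out link RDDs serialized, spilling to disk
      // when they do not fit in memory, trading speed for a smaller footprint.
      val model = new ALS()
        .setRank(2)
        .setIterations(5)
        .setLambda(0.1)
        .setIntermediateRDDStorageLevel(StorageLevel.MEMORY_AND_DISK_SER)
        .run(ratings)

      // Predict a single (user, product) rating to show the model is usable.
      println(model.predict(1, 10))
    } finally {
      sc.stop()
    }
  }
}
```

Whether the serialized, compressed storage level is worth the slowdown depends on the dataset and cluster, so it is probably best treated as a setting to try when the default storage level runs out of memory rather than as a general recommendation.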